Wang et al. Journal of Orcadian Rhythms 201 1, 9:1 1 
http://www.jcircadianrhythms.eom/content/9/1/1 1 



RESEARCH Open Access 



Measuring the impact of apnea and obesity on 
circadian activity patterns using functional linear 
modeling of actigraphy data 

Jia Wang 1 , Hong Xian 1,2 , Amy Licis 3 , Elena Deych 1 , Jimin Ding 4 , Jennifer McLeland 3 , Cristina Toedebusch 3 , Tao Li 1 , 
Stephen Duntley 3 and William Shannon 1 " 

Abstract 

Background: Actigraphy provides a way to objectively measure activity in human subjects. This paper describes a 
novel family of statistical methods that can be used to analyze this data in a more comprehensive way. 

Methods: A statistical method for testing differences in activity patterns measured by actigraphy across subgroups 
using functional data analysis is described. For illustration this method is used to statistically assess the impact of 
apnea-hypopnea index (apnea) and body mass index (BMI) on circadian activity patterns measured using 
actigraphy in 395 participants from 18 to 80 years old, referred to the Washington University Sleep Medicine 
Center for general sleep medicine care. Mathematical descriptions of the methods and results from their 
application to real data are presented. 

Results: Activity patterns were recorded by an Actical device (Philips Respironics Inc.) every minute for at least 
seven days. Functional linear modeling was used to detect the association between circadian activity patterns and 
apnea and BMI. Results indicate that participants in high apnea group have statistically lower activity during the 
day, and that BMI in our study population does not significantly impact circadian patterns. 

Conclusions: Compared with analysis using summary measures (e.g., average activity over 24 hours, total sleep 
time), Functional Data Analysis (FDA) is a novel statistical framework that more efficiently analyzes information from 
actigraphy data. FDA has the potential to reposition the focus of actigraphy data from general sleep assessment to 
rigorous analyses of circadian activity rhythms. 

Keywords: Apnea, BMI, circadian activity patterns, Functional Data Analysis 



1. Introduction 

Activity measured by wrist actigraphy has been shown 
to be a valid marker of entrained Polysomnography 
(PSG) sleep phase and is strongly correlated with 
entrained endogenous circadian phase [1]. Actigraphy 
data is recorded densely, such as every minute or every 
15 seconds, for each patient over multiple days. This 
data is generally analyzed by reducing the time series 
activity values to summary statistics such as sleep/wake 
ratios, [2,3] total sleep time, [2,4] sleep efficiency, [5,6] 
wake after sleep onset, [2,3,6] ratio of nighttime activity 
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to daytime activity or total activity, [7,8] standard devia- 
tion of sleep onset time, [9] and intra-daily variability 
[10]. More complex modeling of actigraphy includes 
spectral analysis, [7] cosinor analysis [7] and waveform 
eduction calculated as an "average waveform" for some 
period [11]. 

In this paper we propose a novel statistical framework, 
Functional Linear Modeling (FLM), a subset of Func- 
tional Data Analysis (FDA), for analyzing actigraphy 
data to extract and analyze circadian activity informa- 
tion through direct analysis of raw activity values [12]. 
FLM extends standard linear regression to the analysis 
of functions, which in this case represent circadian 
activity patterns. FLM is performed by 1) converting a 
subject's raw actigraphy data to a functional form (i.e., 
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continuous curve over time), and 2) analyzing sets of 
functions to see if they differ statistically across groups. 
Our FLM-based analysis shows where and with what 
level the difference between groups occurs along the 
time, which provides valuable reference for clinical ana- 
lysis and treatments, and distinguishes our methods 
from existing circadian analysis works (see [13] for a 
review). Moreover, we adopted a non-parametric per- 
mutation F test to detect the difference between groups, 
which makes the results robust to the uncertainty in 
raw data distribution. Using FLM, we show that the 
apnea-hypopnea index (apnea) has a statistically signifi- 
cant impact on circadian activity patterns, while body 
mass index (BMI) in this dataset has little impact. 

2. Methods 

2.1 Participants and Measures 

Participants were recruited prospectively from the clinic 
at Washington University in St. Louis Sleep Medicine 
Center. The sleep center is a multidisciplinary clinic at a 
tertiary medical facility. Clinic patients with a suspected 
diagnosis of obstructive sleep apnea (OSA), insomnia, or 
restless legs syndrome (RLS) were invited to participate. 
Pregnant women, individuals under age of 18, and 
patients who report working an evening or overnight 
shift were excluded from participation due to known 
biologically different circadian clocks. Clinical covariates 
such as BMI, co-morbidities, concomitant medications, 
and presenting sleep complaints were collected. Partici- 
pants underwent an overnight PSG when clinically indi- 
cated. These data were collected in accordance with the 
standards of the American Academy of Sleep Medicine 
(AASM) and were reviewed by a board certified sleep 
physician. PSG data were scored according to the 
AASM Manual for the Scoring of Sleep and Associated 
Events. This ongoing study has been approved by the 
Washington University School of Medicine Institutional 
Review Board. 

Activity was measured using Actical devices (Philips 
Respironics Inc.) which were positioned on the non-domi- 
nant wrist of subjects at the initial sleep center visit and 
set to measure activity every minute for 7 days. Three 
hundred and ninety five patients have been recruited, of 
which 305 have apnea and/or BMI measured. This sub- 
group comes from a larger NIH funded study currently 
recruiting a cross section of 750 patients referred to the 
Washington University Sleep Medicine Center for the pur- 
pose of developing and validating functional data analysis 
methods for actigraphy data (HL092347). 

2.2. Functional Data Analysis (FDA) 

FDA is an emerging field in statistics that extends classi- 
cal statistical methods for analyzing sets of numbers 
(scalars for univariate analyses, and vectors for 



multivariate analyses) to analyzing sets of functions [13] 
[15]. FDA is a subset of the larger field called 'object 
data analysis' or 'object oriented data analysis' that uses 
statistical methods to analyze data that are in non- 
numeric form such as images, graphs (e.g., trees), or 
functions [14,15]. The goal of object oriented data ana- 
lysis is to analyze objects in their natural form (e.g., 
functions, graphs) to extract more information than 
generally can be extracted when the objects are con- 
verted into simpler summary measures (e.g., average 
activity level, total sleep time) where standard statistical 
methods can be applied. 
2.2. 1 Functional smoothing 

Functional data analysis (FDA) begins by replacing dis- 
crete activity values measured at each time unit (e.g., 
minute) by a function to model the data and reduce 
variability. The function represents the expected activity 
value at each time point measured. Since the actigraphy 
has equidistant data, to allow flexibility in representing 
the data as a function, a Fourier expansion model is 
used, though any smoothing method could be used. Let 
y k j be the discrete activity count for patient k at time 
point t k p then the model 

y k j = Activity k (t kj ) + s k (t kj ) (1) 

represents activity, where k = 1, 2,...,A/,Af is total num- 
ber of patients, ; = 1, 2,...,r /c , T k is the total number of 
time points for patient k. In our dataset, observation 
times are minutes from midnight to midnight in 24 
hours, so all subjects have the same number of measure- 
ments T k . 

We convert the raw actigraphy data to a functional 
form using a basis function expansion for Activity k {tj) 

Activity k {tj) = tfifcOi (t ; ) + a 2 fc0 2 (tj) ^) 

+ ••• +a nk <&n{tj) 

where {tfffc}" =1 are scalar coefficients for patient k and 

{0 1 (-)}J 2 =1 are basis functions. Possible basis functions 

include polynomials (f(t) = a x t + a 2 t l + ... + a^), Four- 

(f(t) = a\ + a 2 sin(o;t) + a 3 cos(cot)+ 
ier basis . , splines, 

#4 s'm(2cot) + as cos(2o;t) + • • • + a n (p n ) 

and wavelets. 

Experimental results (unpublished) show most basis 
functions work equally well and we have found a Four- 
ier expansion with n = 9 basis functions capture the 
major trend of activity pattern with reduced noise. Let 

Oi(t) = 1, 0 2 (t) = cos{cot) f 0 3 (t) = 

2n 

sm(cot), . . . , O s (t) = cos(4<x>t), 0 9 (t) = sm(4cot), co = — 

where T is the period, in our case T = 1440 (number 
of minutes in 24 hours). Equation 1 becomes 
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9 

Activity k {tj) = 

i=i 

We will use this functional representation for all ana- 
lyses in this paper. 

Smooth coefficients of the expansion {dik}? =i are esti- 
mated by minimizing the unweighted least squares cri- 
terion SMSSE [12]: 

J440 ^-^9 , x 2 

SMSSE (y k \a k ) = J2 j=1 fo* " H i=l a ^ (3) 

where y k = (y lh y 2k >.«>yi440k) > = (&ih a 2k ,...,a 9k )'. 
In matrix terms, this criterion becomes: 

SMSSE (y k I a k ) = {y k - ^a k )\y k - &a k ). (4) 

where O is a 1440 x 9 matrix with columns for basis 
functions and rows for basis value at each minute. 

Taking the derivative of the criterion SMSSE(y k \a k ) 
with respect to a, gives 20 0^^ - 1<S>y k , and setting this 
equal to 0 and solving for a provides the estimate a 
that minimizes the least square solution, 

a k = (O'fcJ'Vyfe. (5) 

Then, the vector y of smoothed activity fitted values 

is 

yk = OSfe = ^(O'O) -1 ^')^ (6) 



The raw data does not need to be normalized since all 
analyzes are done on the functional form of the data. 

To avoid introducing variation between weekday and 
weekend activity patterns, only data from midnight 
Monday to midnight Friday was used in this paper, 
although this simplification is not required for analysis. 
The five weekdays of actigraphy data were averaged into 
a single 24 hour profile and a smooth Fourier expansion 
function was fitted using a 24 hour periodicity and 9 
basis functions. This produced a single 24 hour circa- 
dian activity pattern for each subject that can be used to 
estimate patient's activity level at any time point 
throughout the day. We are developing and preparing to 
publish functional linear mixed models which will ana- 
lyze every day's activity data to incorporate day effects, 
weekday/weekend effects, and pre/post treatment effects 
which will provide more insight into circadian rhythm 
patterns and within- subject variability. 

This data smoothing method is illustrated in Figure 1 
for a typical subject. Plot (a) shows weekdays ordered 
Monday through Friday from top to bottom, with the 
time of day indicated on the X axis running from mid- 
night to midnight, and the height of the spike indicating 
the raw activity level on the Y axis collected by the acti- 
graphy watch at each minute interval. Plot (b) shows the 
activity averaged at each minute over the 5 days (black 
points) and the Fourier expansion representing this 
patient's circadian activity pattern (red solid line). 
2.2.2 Functional Linear Models 

Reducing actigraphy data to a summary statistic can 
mask differences across groups. For example, if one 
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Figure 1 Data flow for one subject. Plot (a) shows weekdays ordered Monday through Friday from top to bottom, with the time of day 
indicated on the X and the height of the spike indicating the raw activity level on the Y axis. The plot (b) shows the activity averaged at each 
minute over the 5 days (black points) and the Fourier expansion representing this patient's circadian activity pattern (red solid line). 
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group of patients has high activity in the morning and 
low activity in the afternoon, and another group has a 
reversed pattern with the same magnitude of activity, 
low activity in the morning and high activity in the 
afternoon, their average activity may be similar, and a 
significant difference in circadian activity patterns would 
be missed. FLM avoids masking by extending the linear 
regression model to the analysis of smooth functions (i. 
e. circadian activity patterns), and differences such as 
described in this example become apparent. 

The conceptual change going from classical linear 
regression to FLM is that the model regression coeffi- 
cients, (e.g. p 0 , Pi), and error term are functions. To 
illustrate the use of FLMs for analyzing actigraphy data, 
four subjects from our database with the highest apnea 
scores and four subjects with the lowest apnea scores 
were selected, apnea is a measure of apnea-hypopnea 
index used routinely in sleep medicine, and measures 
the severity of sleep apnea with high values indicating 
more severe disease. In Figure 2, the circadian activity 
patterns fitted by Fourier expansion for each of the 8 
subjects are shown in separate plots with time recorded 
on the X axis, and activity level on the Y axis. The top 4 
plots show the high apnea subjects (severe sleep apnea) 
and the bottom 4 plots show the low apnea subjects 
(mild or no sleep apnea). Visually there is a large differ- 
ence between the circadian patterns in the high and low 
apnea subjects. 

Using this subset of subjects, functional smoothing 
and linear modeling is illustrated in this section. In the 
following section the methods are applied to the full 
dataset. 



To test whether high and low apnea patients have dif- 
ferent activity levels, standard approaches would reduce 
each subject's data to an average activity level, and a 
classical statistical method such as linear regression 
would test if these values are the same or different. For 
example, a linear regression model to test if there are 
differences in average activity between the high apnea 
(average activity = 78, 76, 80 and 76) and low apnea 
(average activity = 370, 397, 482 and 421) groups is 
defined as 

Activity k = fi 0 + Pi x AHI + s k . (7) 

where k = 1,2,...,8 are the subjects in Figure 2, apnea is 
the group membership indicator with apnea = 1 for low 
apnea subjects, apnea = -1 for high apnea subjects, and 
s k is the error term. The resulting model fit to this data 
is Activity k = 247.9 + 169.9 x apnea, P < 0.001, and R 2 
= 0.97. The estimated mean activity in the 4 low apnea 
subjects is 247.9 + 169.9 = 417.8, and in the 4 high 
apnea subjects is 247.9 - 169.9 = 78. This statistical ana- 
lysis confirms the clinical belief that apnea impacts 
activity, and confirms what is seen in Figure 2. However, 
it does not tell us when during the day activity levels are 
different. 

Figure 3 illustrates how functional linear modeling is 
applied to actigraphy data to test for differences 
between the two apnea groups, and show where during 
the day those differences occur. Plot (a) shows the 8 
individual circadian activity patterns with blue and red 
line for high and low apnea groups, respectively. The 
overall mean circadian activity pattern is the solid 
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Figure 2 Smoothed activity of 8 subjects fitted by Fourier expansion and shown in separate plots with time recorded on the X axis, 

and activity level on the Y axis. The top 4 plots show the high apnea subjects and the bottom 4 plots show the low apnea subjects, 
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Figure 3 FLM result for 8 subjects. Plot (a) shows the 8 individual 
circadian activity patterns with blue and red line for high and low 
apnea groups, respectively. The overall mean circadian activity 
pattern is the solid black line and the mean circadian activity 
patterns for the high and low apnea groups are thick blue and red 
line, respectively. Plot (b) shows F-test result the red solid curve 
represents the observed statistic F(t) at each time point, the blue 
dashed and dotted lines correspond to a global and point-wise test 
of significance at significant level a = 0.05, respectively. 



black line and the mean circadian activity patterns 
separately for the high and low apnea groups are the 
thick blue and red line, respectively. Plot (a) shows a 
clear separation of the mean circadian activity patterns 
for the two apnea groups and identifies when during 
daytime those curves differ. In addition, circadian 
activity behaviors become apparent with this analysis. 
For example, the maximum activity in the high apnea 
group (thick blue line) occurs in the morning with a 
steady decline in activity the remainder of the day, 
compared to low apnea group (thick red line), the 
maximum activity occurs at about 3 PM and is stable 
from about 9 AM to noon and from about 6 PM to 9 
PM. 

As in the linear regression model described above, we 
are interested in estimating regression coefficients that 
will produce the group-specific mean circadian activity 
patterns, and test if these mean circadian activity pat- 
terns are different across groups. This model, for apnea, 
is defined as [12]: 



Activity k (t) = p 0 (t) + P\{t) x AHI+ 
e k {i),k= 1,2,...N 



(8) 



where the (t) notation indicates functions over the cir- 
cadian period for activity (fitted by the Fourier 



expansion to the actigraphy data for each subject k) 
Activity k (£), the mean circadian activity pattern over all 
subjects po(t), the functional coefficient indicating how 
the mean circadian activity patterns changes for low 
apnea subjects (apnea = 1, po(t) + Pi(t)), or for high 
apnea subjects (apnea = -1, Po(t) - Pi(t))> and s^t) is the 
functional error term. In other words, the low apnea 
group is predicted to have a mean circadian activity pat- 
tern found by adding the two functions Po(t) + Pi(t), 
and the high apnea group is predicted to have a mean 
circadian activity pattern found by subtracting the two 
functions p o (t) - Pi{t). In Figure 3A p o (t) is the thick 
black line representing the overall mean, p o (t) + Pi(t) is 
the thick red line for the mean of the low apnea group, 
and Po(t) - Pi(t) is the thick blue line for the mean of 
the high apnea group. 

Equation 8 can be formulated as a matrix analysis pro- 
blem as described above using a Nx2 design matrix Z 
with rows indicating subjects and columns indicating 
the mean function (column 1) and effects on the activity 
due to apnea level g (column 2). In standard matrix 
notation each row is a vector of l's and -l's indicating if 
the subject belongs to high apnea (1, -1) and low apnea 
(1, 1). The two functional linear coefficients are repre- 
sented in matrix notation as a 'functional vector 



MO 
MO 



and the smoothed functional data represented in 
matrix form by 



Act{t) = 



Activity i (t) ' 
Activity 2(t) 

Activity ^(t) 



where each row represents a subject's fitted activity 
values. Finally, the functional error matrix is defined as 
£ (t) - ( £ i(t)> 8 2(t)i'''i 8 N(t)) • Equation 8 in matrix notation 
becomes, 



Act{t) =Zp(t) + e[i). 



(9) 



The coefficients P(t) are estimated by minimizing a 
least squares estimate 

N 

LMSSE(P) = J2 i Actk W ~ z kP{t)f dt ( 10 ) 

where Z k is the J^ h row of the design matrix Z. 

After we estimate p(t) for P(t) in function linear 
regression, we also want to measure the accuracy of our 
estimation result. We calculate the point-wise 95% con- 
fidence limits for these effects using residuals from the 
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model. This formulation is the same as the standard lin- 
ear model except that instead of numeric coefficients we 
are now estimating functional coefficients defined over 
the 24 hour circadian period. A statistical test of the 
null hypothesis that the circadian activity patterns are 
the same in both groups is given by the function [12]: 

Var[{Z$) k {t)] 

F{t) = 1 2 (ID 

- £ (Act k (t) - (Z0) fc (t)) 

N fe=l 

where Z is the design matrix and p is a vector of the 
estimated regression coefficient functions. 

Because of the nature of functional statistics, it is diffi- 
cult to attempt to derive a theoretical null distribution 
for any given test statistic. Instead, we applied a non- 
parametric permutation test methodology. If there is no 
relationship between activity pattern and apnea levels, it 
should make no difference if we randomly rearrange the 
apnea group assignment. The advantage of this is that 
we no longer need to rely on distributional assumptions 
while the disadvantage is that we cannot test for the sig- 
nificance of an individual covariate among many. The p 
value of the test can then be calculated by counting the 
proportion of permutation F values that are larger than 
the F statistics for the observed pairing. Here we used 
two different ways to counting the proportion: global 
test and point-wise test. Global test provides a single 
number which is the proportion of maximized F values 
from each permutation. Point-wise test provides a curve 
which is the proportion of all permutation F values at 
each time point. 

Plot (b) in Figure 3 provides a display for the statisti- 
cal significance test for the differences in circadian activ- 
ity patterns continuously over time. The blue dashed 
and dotted lines correspond to a global and point-wise 
test of significance at significant level a = 0.05, respec- 
tively, and the red solid curve represents the observed 
statistic F(t) at each time point. When F(t) is above the 
blue dashed or dotted line, it is concluded the two 
apnea groups have significantly different mean circadian 
activity patterns at those time points. The global critical 
value (blue dashed line) is preferred since this represents 
a more conservative test. For these data, the two apnea 
groups are statistically different in activity from approxi- 
mately 7 AM - 9 PM. 

The statistical and computational details for fitting 
FLM models are well described elsewhere and are out- 
side the scope of this paper. The reader interested in 
these details are referred to Ramsay and Silverman [12]. 

This illustration was meant as an introduction to the 
methodology only, and not an indicator of a clinical 
conclusion. In the following section, these methods are 



applied to the entire 395 subject dataset, and show how 
apnea and BMI clinically impacts circadian activity 
patterns. 

3. Results 

3.1 Demographic Information 

Table 1 shows basic demographic information and sam- 
ple characteristics. Baseline covariates have been col- 
lected from 395 participants (196 females), age ranging 
from 18 to 80 years old. The average apnea score is 22.1 
(standard deviation = 28.1) and average BMI is 34.7 
(standard deviation = 8.9). Clinically, BMI > 30 is used 
to separate subjects into obese and non-obese cate- 
gories. However, subjects in our database were recruited 
from a sleep center and had higher BMI than found in 
the general population, so we cannot generalize our 
conclusions of the impact of BMI on circadian activity 
to the entire population. 



Table 1 Demographic information and sample 
characteristics 



Variable 




N (%) Mean ± std 
(N Total 395) 


Female 




196(49.87%) 


Race 


African-American 


134 (35.08%) 




Caucasian 


237 (62.04%) 


Presenting Symptoms 


Snoring 


279 (70.63%) 




Gasping 


93 (23.54%) 




Morning headache 


67 (16.96%) 




RLS symptoms 


26 (6.58%) 




PLMS 


3 (0.76%) 




Witnessed apneas 


146 (36.96%) 




Insomnia 


42 (10.63%) 




Excessive day sleepiness 


91 (23.04%) 




Non restorative sleep 


9 (2.28%) 


Mallampati score 


Class 4 


145 (41.55%) 




Class 3 


136 (38.97%) 




Class 2 


53 (15.19%) 




Class 1 


15 (4.30%) 


Diagnosis Result 


OSA 


292 (73.92%) 




RLS 


5 (1.27%) 




Insomnia 


8 (2.03%) 




Hypersomnia 


20 (5.06%) 


BMI > 30 




241 (60.86%) 


BMI 




34.66 ± 8.88 
(Median = 34) 


Age(years) 




47.9 ± 14.8 


apnea 




22.11 ± 28.11 
(Median = 12.95) 
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Figure 4 Smoothed Activity for individuals as black solid 
curves and overall mean as red curves. 



3.2 Smoothed Functional Actigraphy Data 

Raw actigraphy data were read into the R statistical soft- 
ware for analysis using the FDA package and software 
written by our group to apply FLM methods. Two hun- 
dred and eighty nine patients have actigraphy data. Each 
patient's data from midnight Monday through midnight 
Friday were averaged and fit by a 9 basis Fourier expan- 
sion and their circadian activity patterns plotted in Fig- 
ure 4. The mean circadian activity pattern across all 
subjects is shown by the red line. While general struc- 
ture is visible (e.g., lower activity during sleep hours), 
the overlap of these curves makes clinically meaningful 
interpretation difficult. 

3.3. Functional Liner Model (FLM) Results 

We apply FLM to measure the impact of apnea and 
BMI on subject circadian activity patterns and test the 
null hypothesis that circadian activity patterns are the 
same regardless of apnea and BMI values. The alterna- 
tive hypothesis is that apnea, BMI, and/or their interac- 
tion effect activity behavior in a statistically significant 
way. In addition to the tests of hypotheses, FLM pro- 
vides a graphical view of the subgroup circadian activity 
patterns that can aid interpretation of behavioral 
differences. 

To fit these models, each subject is categorized 
according to their apnea and BMI values by: 



AHI 
BMI 



1 if AHI < Median of AHI 

- 1 if AHI > Median of AHI 

1 ifBMI<30 

- 1 if BMI > 30. 



For the 295 subjects, 235 subjects had data on apnea 
and actigraphy, 277 subjects had BMI and actigraphy, 
and 232 subjects had apnea, BMI, and actigraphy. The 
following analyses are based on these subsets. 

We fit the following three functional linear models as 
defined in Table 2. The first two models measure the 
univariate impact of apnea (N = 235) and BMI (N = 
277) separately, and the third model measures their 
multivariate impact (N = 232). The models are pre- 
sented in this order to go from a less to more compli- 
cated analysis. Converting apnea and BMI into binary 
categories was done for simplification but is not neces- 
sary for functional linear modeling, and continuous 
apnea, BMI, or other covariates could be used. At the 
end, we show how BMI can be analyzed by FLM as a 
continuous variable. 
3.3. 1 Apnea Main Effect Models 

The impact of apnea as a main effect on circadian activ- 
ity patterns was tested with Model 1, Table 2. The null 
hypothesis is that the circadian actigraphy patterns are 
the same in the two apnea groups. Of the 235 subjects 
in this analysis, 118 have apnea less than the median 
apnea = 10.8, and 117 patients have apnea larger than 
or equal to 10.8. 

Figure 5 presents the estimated group means with 95% 
confidence bands in plot (a). The low apnea group indi- 
cating less disease severity (red solid line) has higher 
activity during the day compared to the high apnea 
group (blue solid line). The confidence bands around 
the two group mean curves do not overlap during the 
day suggesting the variability in the group circadian 
activity patterns do not cross. The F-test in the plot (b) 
indicates when these curves are statistically different 
during the day. The F-test result shows that the two 
apnea groups are significantly different from about 7 
AM to 9 PM. 
3.3.2. BMI main effect 

Next, the impact of BMI as a main effect on circadian 
activity patterns was measured using Model 2, Table 2. 
The null hypothesis is that the circadian activity pat- 
terns are the same in non-obese (BMI < 30) and obese 
(BMI > = 30) groups. 182 patients are classified as obese 



Table 2 Three Functional Linear Models 



Model 1 


apnea Main Effect Only 


Activity^ = p 0 (t) + /Wt) x AHI k + s k (t) 


Model 2 


BMI Main Effect Only 


Activity^) = p 0 (t) + /Wt) x BMI k + s k (t) 


Model 3 


apnea+BMI+interaction 


Activity^) = p 0 (t) + /Wt) x AHI k + /Wt) x BMI k +p AHI x BMI (t) x AHI k x BMI^ + s k (t) 
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Activity- AH I 



— ' AH I >=10.8 95% CB 

AH I < 10.5 

AH I <10.9 95% CB 




Midnight 



Noon 
(a) 

Permutation F-Test 



Midnight 



I 

d. = - 



Observed Statistic 
pointwise 0.05 critical value 
maximum 0.05 critical value 




Midnight 



Midnight 



Figure 5 FLM result for apnea main effect model. Plot (a) is 
estimated activity patterns for two apnea groups and 95% 
confidence band. Plot (b) is F-test result for this model. 



Table 3 Sample size for apnea, BMI mode 



and 95 as non-obese. Figure 6 presents estimated group 
means with 95% confidence band and F-test result. The 
high BMI group (blue solid line) has higher activity dur- 
ing night and lower activity during daytime, but activity 
patterns for the two groups are only significantly differ- 
ent around 3 AM and 6 PM. 

We emphasize that the population of participants in 
this study had a higher overall BMI compared to the 
general population which may explain why the expected 



Activity- BMI 



Midnight 




Midnight 



Observed Statistic 
pointwise 0.05 critical value 
maximum 0.05 critical value 




Midnight 



Midnight 



Figure 6 FLM result for BMI main effect model. Plot (a) is 
estimated activity patterns for two BMI groups and 95% confidence 
band. Plot (b) is F-test result for this model. 





apnea Low 
(< 10.75) 


apnea High 
(> = 10.75) 


Total 


BMI > = 30 


61 


94 


155 


BMI < 30 


55 


22 


77 


Total 


116 


116 


232 



difference in circadian activity patterns across these 
groups was not observed. 
3.3.3 Apnea and BMI effect, with interaction 
Model 3, Table 2 was used to measure the impact of 
apnea, BMI, and the apnea x BMI interaction term on 
circadian activity patterns. The null hypothesis is that 
the circadian activity patterns measured are the same in 
the four apnea x BMI groups versus the alternative that 
apnea and/or BMI and/or their interaction impacts cir- 
cadian activity patterns in a statistically significant way. 
Table 3 shows the sample sizes for each group. 
This interaction model has four functional coefficients 

^o{t)J A Hi(t) f PBMi{t)J A HixBMi[t)^ which in combina- 
tion define the four clinical groups (e.g., low BMI and 
low apnea; low BMI and high apnea, etc). The four sub- 
groups' circadian activity can be estimated by adding or 
subtracting the functional coefficients as shown in Table 
4. 

When a subject's apnea or BMI is low, the functional 
coefficient for that factor is added to the mean activity 
pattern. When a subject's apnea or BMI is high, the 
functional coefficient for that factor is subtracted from 
the mean activity pattern. The interaction coefficient is 
added when apnea and BMI are concordant (high/high 
or low/low) and subtracted when apnea and BMI are 
discordant (low/high, high/low). Figure 7 shows the 
activity curves for each of the four groups defined 
according to their apnea/BMI status. The F-test shows a 
significant difference among these four group activity 
patterns between about 7 AM to 11 AM and 12:30 PM 
to 8 PM. 



Table 4 Four group circadian activity result 



apnea 


BMI 




Group Mean 




Low 


Low 


Po{t) 


+ /W(t) + /W(t) + 


Pahixbmi(X) 


Low 


High 


Po(t) 


+ PaHI (t) - $BMl(f) - 


Pahixbmi(1) 


High 


Low 


Poit) 


- Pari (l) + Pbmi(X) - 


Pahixbmi(1) 


High 


High 


Po(t) 


- Pahi (t) - Pbmi (t) + 


Pahixbmi(X) 
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Activity-AHI+BMI+AHI'BMI 




Midnight 



Midnight 



Permutation F-Test 



Observed Statistic 

pointwise 0.05 critical value 

maximum 0.05 critical value 




Midnight 



Midnight 



(b) 



Figure 7 FLM result for apnea and BMI model. Plot (a) is 
estimated activity patterns for the four groups and 95% confidence 
band. Plot (b) is F-test result for this model. 



Activity- BMI 




Midnight 



Observed Statistic 
pointiAiise 0.05 critical value 
maximum 0.05 critical value 




Midnight 



Midnight 



Figure 8 FLM result for BMI model treating BMI as continuous. 

Plot (a) is estimated activity patters for BMI groups. Plot (b) is F-test 
result for this model. 



It is an established statistical practice in a linear 
regression model to test the main effects of two covari- 
ates and the effect of the interaction of the two covari- 
ates. We extended this method to the functional linear 
model The comparisons of all 4 groups in this section 
are actually the evaluation of the combination of the 
main and interaction effects which should be consistent 
with a 2 -way ANOVA. 
3.3.4 BMI as a Continuous Variable 

As noted above, BMI showed little impact on circadian 
activity patterns which does not correspond to general 
clinical belief. This is most likely explained by the fact 
that our subject population has high BMI relative to the 
general population, so the distinction between obese 
and non-obese was less pronounced. In this section, we 
fit a functional linear model treating BMI as a continu- 
ous variable. BMI ranges from 17 to 67 in this dataset. 
Figure 8 presents estimated means and F-test result. In 
this plot, each color represents one BMI group. The lar- 
gest BMI group has higher activity during night and 
lower activity during daytime. BMI impact is significant 
around 1 AM to 4 AM and 4 PM to 8 PM. It is noted 
that the significantly different time periods are longer 
than those obtained from categorized BMI effect model. 

4. Discussion 

Traditionally, actigraphy data is transformed into sum- 
mary numbers, such as total sleep time, sleep efficiency, 
wake after sleep onset, and other measurements. These 
transformations allow data analysts to test hypothesis 



using simple classical statistical methods. However, large 
amount of information can be lost and problems of 
masking circadian patterns may arise. 

The merit of functional linear modeling relies in 
determining when along the 24-hour scale groups differ. 
Results from parameter tests in a cosinor approach 
would provide information as to differences in harmonic 
content between groups. Another advantage of the func- 
tional linear modeling approach is exemplified in Figure 
8, where BMI is used as a variable instead of comparing 
groups with higher versus lower BMI values. 

In this paper we have presented a novel approach for 
analyzing the full actigraphy data which we believe 
avoids significant information loss and masking effect. 
Representing actigraphy data as smooth continuous 
functions, and applying Functional Linear Modeling 
methods allowed us to directly compare and test differ- 
ences of circadian activity patterns across apnea and 
BMI subgroups. Other Functional Data Analysis meth- 
ods using principal components analysis ( [15]; Zeitzer, 
et al. 'Phenotyping apathy in individuals with Alzhei- 
mer's using functional principal component analysis', 
Revised and Resubmitted) for identifying sources of 
variability within circadian activity patterns across sub- 
groups, and mixed effect models (Ding, et al., 'Func- 
tional Linear Mixed Effects Model for Actigraphy Data', 
In Preparation) for incorporating additional sources of 
within subject variability are currently being developed 
in our lab and applied to this type of data. Functional 
linear mixed models are also being developed in our lab 
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which will allow within-subject variability such as day- 
to-day or pre-treatment to post-treatment differences in 
activity to be analyzed. 
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